A comprehensive study on the efficacy of a wearable sleep aid device featuring closed-loop real-time acoustic stimulation

Difficulty falling asleep is one of the typical symptoms of insomnia. However, currently available interventions, ranging from pharmaceutical to high-tech tailored solutions, remain ineffective because they lack precise real-time sleep tracking, timely feedback on the therapy, and the ability to keep people asleep during the night. This paper aims to enhance the efficacy of such interventions by proposing a novel sleep aid system that continuously senses multiple physiological signals and simultaneously controls auditory stimulation to evoke appropriate brain responses for fast sleep promotion. The system, a lightweight, comfortable, and user-friendly headband, employs a comprehensive set of algorithms and dedicated, custom-designed audio stimuli. Compared with the gold-standard device in 883 sleep studies on 377 subjects, the proposed system achieves (1) a strong correlation (0.89 ± 0.03) between the physiological signals acquired by our device and those from the gold-standard PSG, (2) an 87.8% agreement between automatic sleep scoring and the consensus scored by sleep technicians, and (3) successful non-pharmacological real-time stimulation that shortens the time taken to fall asleep by 24.1 min. In conclusion, our solution exceeds existing ones in promoting fast sleep onset, tracking sleep state accurately, and achieving high social acceptance through a reliable large-scale evaluation.

from the EEG signal component, and a 38-dimensional feature vector is computed to summarize time and frequency domain analyses from each channel's EEG, EOG, and EMG components. Two PML model input tensors are finally computed from each channel.

Supplementary Module SM 4: Primary machine learning (PML) EEG/EOG/EMG-based model
Morphological aspects of biosignal data critical to sleep stage inference and interpretation are present in the time, frequency, and time-frequency domains. Hence, in this module, we employ both recurrent and convolutional neural sub-networks in a hybrid-input manner to leverage such spatial and time-dependent features. Specifically, the PML model takes as inputs the spectrogram and the 38-dimensional feature vector for each epoch. The spectrogram is fed into a shallow, 2-layer convolutional neural sub-network. The feature vector, together with 7 epochs of historical feature data, is fed to the recurrent neural sub-network to obtain an additional feature mapping. The output vectors of the two sub-networks are concatenated into a 928-dimensional latent feature vector, which is presented to a single-layer dense classification head. We then apply a final softmax layer [2] to the 4-unit output to obtain a probability distribution over the four sleep stage classes (i.e., W, LS, DS, and REM). At each epoch, the spectrogram and feature vector are computed for each channel with acceptable signal quality, and a forward pass through the network is performed using these data representations. If multiple channels have acceptable signal quality, the channels' sleep stage probability distributions are averaged to obtain an ensembled sleep stage confidence estimate. The estimated sleep stage for the epoch is the stage with the highest confidence estimate.
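The channel-ensembling step described above can be illustrated with a minimal sketch. It assumes each usable channel has already produced a 4-element softmax output in the order W, LS, DS, REM; the function name `ensemble_stage` and the example probabilities are hypothetical and are not taken from the authors' implementation.

```python
# Illustrative sketch only: names and example values are assumptions.
STAGES = ("W", "LS", "DS", "REM")


def ensemble_stage(channel_probs):
    """Average per-channel softmax outputs and pick the most likely stage.

    channel_probs: list of 4-element probability lists, one per channel
    that passed the signal-quality check.
    Returns (stage_label, averaged_distribution).
    """
    if not channel_probs:
        raise ValueError("no channel with acceptable signal quality")
    n = len(channel_probs)
    # Element-wise mean across channels gives the ensembled confidence.
    avg = [sum(p[i] for p in channel_probs) / n for i in range(len(STAGES))]
    # The epoch's stage is the class with the highest averaged confidence.
    best = max(range(len(STAGES)), key=lambda i: avg[i])
    return STAGES[best], avg


# Two hypothetical channels that disagree on the most likely stage.
probs = [
    [0.10, 0.60, 0.20, 0.10],
    [0.20, 0.30, 0.40, 0.10],
]
stage, avg = ensemble_stage(probs)
```

Averaging the distributions before taking the maximum lets a confident channel outweigh an uncertain one, which a simple majority vote over per-channel labels would not do.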
The PML model architecture is tuned using k-fold cross-validation [3] and designed to balance size and efficiency so that accurate, real-time sleep scoring can be performed on a user's mobile device. Specifically, model parameters were learned using a training/validation dataset composed of 106 randomly selected sleep studies, which accounted for 68% of the 155 total sleep studies. In this process, we set k to 11 such that each fold had 96 sleep sessions for training and 10 for validation. It is worth noting that the final fold had only 6 validation sessions because the total number of training sessions was 106. Averaged over the 11 folds, we achieved an accuracy of 84.08 ± 1.42%. Details on this model evaluation are provided in Supplementary Tab. ST 1. However, a primary challenge in developing algorithms that perform inference on electrophysiological signals is inter-user generalization. Many components of the EEG, EOG, and EMG signals vary substantially across people, which makes it challenging to develop models that do not overfit the data they were trained on. Furthermore, signal morphology is influenced by the hardware used for data acquisition, which reduces the efficacy of simply training a model on large data sets acquired using clinical PSG hardware. Consequently, we use a two-step training process of pre-training followed by fine-tuning to mitigate the performance issues that can arise from these challenges.
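The fold arithmetic above can be checked with a short sketch. It assumes that validation folds are consecutive blocks of 10 sessions with the remainder in the last block; the helper `validation_folds` is hypothetical, and the random selection of sessions is omitted for brevity.

```python
# Illustrative sketch only: the partitioning scheme is an assumption.
def validation_folds(n_sessions, fold_size):
    """Partition session indices into consecutive validation folds.

    Every fold holds `fold_size` sessions except the last, which takes
    whatever remains.
    """
    indices = list(range(n_sessions))
    return [indices[i:i + fold_size] for i in range(0, n_sessions, fold_size)]


N_SESSIONS = 106  # training/validation sessions reported in the text
folds = validation_folds(N_SESSIONS, fold_size=10)

# (training count, validation count) for each fold.
splits = [(N_SESSIONS - len(val), len(val)) for val in folds]
```

Under this assumption there are 11 folds: ten with 96 training and 10 validation sessions, and a final fold with 6 validation sessions, which would leave 100 sessions for training in that fold.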
In the pre-training phase, we train the network using sleep data acquired from clinical-grade PSG devices. This is because the signal hallmarks that sleep technicians use to stage sleep are more visible in this gold-standard data than in data acquired from typical wearable devices: the hardware is carefully configured by the technician, monitored consistently throughout the night, and adjusted as needed. By pre-training the PML model on PSG data from subjects of varying demographics (age, gender, etc.), our model learns signal features critical for scoring and for generalization across users. During training, we stochastically optimize the model parameters using the Adam optimizer [4], iteratively minimizing the categorical cross-entropy loss. Early stopping was used to terminate pre-training by monitoring classification accuracy on a withheld validation dataset [5].
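The early-stopping rule mentioned above can be sketched as follows. The patience value and the accuracy history are assumptions made for illustration; the text does not state the stopping criterion's parameters, and `early_stopping_epoch` is a hypothetical helper rather than the authors' code.

```python
# Illustrative sketch only: patience and history values are assumptions.
def early_stopping_epoch(val_accuracies, patience=3):
    """Return (stop_epoch, best_epoch) for per-epoch validation accuracies.

    Training stops once `patience` consecutive epochs have passed without
    the validation accuracy exceeding its best value so far.
    """
    best_acc = float("-inf")
    best_epoch = -1
    for epoch, acc in enumerate(val_accuracies):
        if acc > best_acc:
            best_acc, best_epoch = acc, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    # The budget ran out before the patience criterion was triggered.
    return len(val_accuracies) - 1, best_epoch


# Hypothetical validation accuracies recorded after each training epoch.
history = [0.70, 0.78, 0.81, 0.80, 0.805, 0.79, 0.80]
stop, best = early_stopping_epoch(history, patience=3)
```

In practice, the weights saved at `best` would typically be restored, so that the pre-trained model corresponds to the highest validation accuracy rather than to the last epoch run.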
Naturally, the distribution of spectrogram and feature vector values in the Earable headband data will differ from that in the clinical-grade PSG data because of the different hardware configurations. The pre-trained PML model must therefore be fine-tuned to the headband's data distribution. In this fine-tuning process, we freeze the layers of the convolutional sub-network while updating the layers of the recurrent sub-network and the classification head by training on sleep data acquired using the Earable headband. As in the pre-training phase, we use the Adam optimizer in this iterative process. Moreover, we use a lowered learning rate and label smoothing [6] of the ground-truth consensus labels to stabilize the optimization in the presence of noisy epochs whose data do not reflect the ground-truth sleep stage.
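To make the effect of label smoothing concrete, the sketch below uses the common formulation in which the one-hot target is mixed with a uniform distribution over the K classes. The smoothing factor of 0.1 and the predicted probabilities are assumed for illustration; the text does not report the value used.

```python
# Illustrative sketch only: epsilon and the prediction are assumptions.
import math


def smooth_labels(true_index, n_classes=4, epsilon=0.1):
    """Return (1 - epsilon) * one_hot + epsilon / n_classes as a list."""
    target = [epsilon / n_classes] * n_classes
    target[true_index] += 1.0 - epsilon
    return target


def cross_entropy(target, predicted):
    """Categorical cross-entropy between a target and a predicted distribution."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0)


# Class order W, LS, DS, REM; the consensus label for this epoch is DS (index 2).
hard = [0.0, 0.0, 1.0, 0.0]
soft = smooth_labels(2)

predicted = [0.1, 0.1, 0.7, 0.1]  # hypothetical softmax output
loss_hard = cross_entropy(hard, predicted)
loss_soft = cross_entropy(soft, predicted)
```

With smoothed targets the loss can no longer be driven toward zero by pushing one class probability to 1, which limits how strongly the network is pulled toward a label that a noisy epoch does not support.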